feat(server): stream monolithic blob PUT straight to object store#7

Open
tonicmuroq wants to merge 1 commit into main from feat/push-streaming-blob

tonicmuroq commented Jun 18, 2026

Problem

persistMonolithicUpload (the PUT /v2/<name>/blobs/<digest> path — the only push path our clients use) spooled the entire blob to a disk tempfile via the chunked-upload session machinery, hashed it, then read it back and uploaded to the object store. So a push was:

client ──> epoch: write whole blob to /var/cache/epoch/uploads tempfile (+ sha256)
                     THEN read it back ──> upload to GCS

Two full passes, ~2x disk I/O, with receive and upload running serially. For multi-GiB VM disk/memory blobs this was the bulk of push time (single PUTs were taking 2–4 min).

Change

A monolithic PUT already knows the digest up front (it's in the URL), so there's no reason to buffer. Stream the request body through a sha256 hasher straight into a concurrent multipart upload (minio ConcurrentStreamParts, bounded at PartSize*NumThreads = 64 MiB × 4), verify the digest once the stream drains, delete on mismatch.

  • No disk spool; receive and GCS upload overlap; multipart instead of a single stream.
  • Server-side sha256 verification preserved — hashed inline, mismatch → object deleted + DIGEST_INVALID.
  • No client change, no trust-model change. Bytes still transit epoch (this fixes the spool/serial waste; it does not move epoch out of the data path — that's presigned-PUT-direct, a separate change).
  • The chunked PATCH path still spools to disk (its digest is only known at finalize); no first-party client uses it.

Measured live (deployed to cocoonstack-us, image redirect-stream-20260618)

Server-side PUT durations (pure upload-through-epoch — most accurate):

| upload | result |
| --- | --- |
| real snapshot push win10-20260618-2 (~10.4 GB, 5 layers), big layer A | 201 in 61s |
| same push, big layer B | 201 in 39s |
| same push, small layers / manifest | 201 in 125–465 ms |
| synthetic 128 MB blob | 201 in 1.6s (~78 MB/s) |
| 128 MB with wrong digest | 400 DIGEST_INVALID (verify + delete works) |

  • No disk spool, confirmed live: the upload-spool dir stayed empty (4.0K) sampled 3× during a multi-GB PUT → bytes stream straight through, nothing buffered whole. This is the core fix.
  • Footprint during push: CPU < 1 core (peak ~775m), mem ~10 MiB (bounded by multipart part buffers, not blob size).
  • The old double-buffered path took ~3–4 min for comparable big blobs; now 39s / 61s.
  • End-to-end bake push (CI wall-clock) measured ~45–49 MB/s vs ~27 MB/s before ≈ 1.7×. (End-to-end is lower than the per-blob server rate above because it includes client-side cocoon snapshot export + per-blob sha256 buffering, not just the network upload.)

Remaining ceiling: bytes still transit epoch (TLS in + multipart out on ~3 CPU), so big layers cap ~80–110 MB/s. GCS-direct push (presigned PUT) is a possible follow-up.

Relation to #6

Independent of #6 (blob GET redirect). Together: pull bypasses the proxy entirely (#6), push stops double-buffering and parallelizes the GCS leg (this). Fully removing epoch from the push data path (presigned PUT direct to GCS, first-party-trust + GCS-CRC32C) is a possible follow-up.

persistMonolithicUpload spooled the entire blob to a disk tempfile (via
the chunked-upload session machinery), hashed it, then read it back and
uploaded to the object store — two full passes plus ~2x disk I/O, and
receive/upload run serially. For multi-GiB VM disk/memory blobs that's
the bulk of a push (single PUTs were taking 2-4 min).

Monolithic PUT already knows the digest up front (it's in the URL), so
there's no need to buffer: stream the request body through a sha256
hasher straight into a concurrent multipart upload (no disk), verify the
digest once the stream drains, and delete on mismatch so the
content-addressed key never keeps unverified bytes. Server-side digest
verification is preserved; no client change.

The chunked PATCH path (no first-party client uses it) still spools to
disk, since its digest is only known at finalize.

GCS S3-compat streaming multipart validated against staging: 12 MiB in
3 concurrent 5 MiB parts, sha256 round-trip verified.